Predicting Depression for Japanese Blog Text
Author
Abstract
This study aims to predict clinical depression, a prevalent mental disorder, from blog posts written in Japanese using machine learning. It focuses on how data quality and three types of linguistic features (characters, tokens, and lemmas) affect prediction outcomes. Depression prediction achieved 95.5% accuracy using selected lemmas as features.
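As a rough illustration of the three feature types named in the abstract, the sketch below builds character, token, and lemma counts for one short Japanese phrase. It is not the study's pipeline: the sentence, its hand-made segmentation, the lemmas, and the helper names `char_ngrams` and `bag_of` are all assumptions chosen for illustration, and a real system would take tokens and lemmas from a Japanese morphological analyzer.

```python
from collections import Counter

# Hypothetical, hand-segmented example; a real pipeline would obtain tokens
# and lemmas from a Japanese morphological analyzer, not from this table.
# Each pair is (surface token, lemma).
ANALYZED = [
    ("最近", "最近"),
    ("眠れ", "眠れる"),
    ("なかっ", "ない"),
    ("た", "た"),
]


def char_ngrams(text, n=2):
    """Count overlapping character n-grams in a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def bag_of(items):
    """Count occurrences of each item (tokens or lemmas)."""
    return Counter(items)


# Rebuild the raw text from the surface tokens (Japanese has no spaces).
text = "".join(surface for surface, _ in ANALYZED)

char_features = char_ngrams(text, n=2)                      # character view
token_features = bag_of(surface for surface, _ in ANALYZED)  # token view
lemma_features = bag_of(lemma for _, lemma in ANALYZED)      # lemma view
```

In this sketch the inflected token なかっ only appears in the token view, while the lemma view maps it to ない, which shows how lemmas can merge inflected variants into one feature.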
Similar resources
Design, Compilation, and Preliminary Analyses of Balanced Corpus of Contemporary Written Japanese
Compilation of a 100 million words balanced corpus called the Balanced Corpus of Contemporary Written Japanese (or BCCWJ) is underway at the National Institute for Japanese Language and Linguistics. The corpus covers a wide range of text genres including books, magazines, newspapers, governmental white papers, textbooks, minutes of the National Diet, internet text (bulletin board and blogs) and...
Automatic Evaluation of Commonsense Knowledge for Refining Japanese ConceptNet
In this paper we present two methods for automatic common sense knowledge evaluation for Japanese entries in the ConceptNet ontology. Our proposed methods utilize a text-mining approach, which is inspired by related research for evaluation of generality on natural sentences using commercial search engines and simpler input: one with relation clue words and WordNet synonyms, and one without. Both meth...
Multimedia Blog Creation System using Dialogue with Intelligent Robot
A multimedia blog creation system is described that uses Japanese dialogue with an intelligent robot. Although multimedia blogs are increasing in popularity, creating blogs is not easy for users who lack high-level information literacy skills. Even skilled users have to waste time creating and assigning text descriptions to their blogs and searching related multimedia such as images, music, and ...
Using a Chunk-based Dependency Parser to Mine Compound Words from Tweets
New words are appearing every day in online communication applications, such as Twitter. Twitter is the world’s most famous online social networking and microblogging service that enables its users to send/read text-based messages of up to 140 characters, known as “tweets”. Because tweets are typed online (as fast as possible) within a limited number of characters, tweets are full...
Automatically Annotating A Five-Billion-Word Corpus of Japanese Blogs for Affect and Sentiment Analysis
This paper presents our research on automatic annotation of a five-billion-word corpus of Japanese blogs with information on affect and sentiment. We first perform a study in emotion blog corpora to discover that there has been no large scale emotion corpus available for the Japanese language. We choose the largest blog corpus for the language and annotate it with the use of two systems for aff...